ERX3267468: Illumina NovaSeq 6000 paired end sequencing
13 ILLUMINA (Illumina NovaSeq 6000) runs: 721.4M spots, 216.4G bases, 40.9Gb downloads

Submitted by: NYGC
Study: 30X whole genome sequencing coverage of the 2504 Phase 3 1000 Genome samples.
Abstract:
We sequenced all 2,504 samples from the 1000 Genomes (1KG) Project to a minimum of 30x mean genome coverage. Though a small number of 1KG samples had been sequenced to high coverage previously, we sequenced all samples to depth on the latest technology, providing a unified dataset for the next phase of analyses. We processed these samples using the laboratory processes we have previously used for the CCDG project (with minor modifications). Specifically, we generated PCR-free sequencing libraries using unique dual indices to avoid the index-switching phenomenon that occurs on Illumina patterned flow cells and causes low-level contamination of sequencing data. We sequenced these samples on the Illumina NovaSeq 6000 sequencing instrument, with 2x150bp reads. We believe this instrument represents the future for WGS with short-read technology, and it was important to sequence the 1KG samples in a format that is consistent with future large-scale sequencing projects. Our automated analysis pipeline for whole genome sequencing matches the CCDG and TOPMed recommended best practices. Sequencing reads were aligned to the human reference, hs38DH, using BWA-MEM v0.7.15. Data are further processed using the GATK best practices (v3.5), which generate VCF files in the 4.2 format. Single nucleotide variants and indels are called using GATK HaplotypeCaller (v3.5), which generates a single-sample GVCF. Variant Quality Score Recalibration (VQSR) is performed using dbSNP138 so quality metrics for each variant can be used in downstream variant filtering.
Sample: Coriell GM19764
SAMN00004475 • SRS006567
Organism: Homo sapiens
Library:
Name: NA19764
Instrument: Illumina NovaSeq 6000
Strategy: WGS
Source: GENOMIC
Selection: RANDOM
Layout: PAIRED
Construction protocol: TruSeq DNA PCR-free
Runs: 13 runs, 721.4M spots, 216.4G bases, 40.9Gb
Run          # of Spots    # of Bases   Size    Published
ERR3240093   360,685,727   108.2G       9Gb     2019-03-25
ERR3560568   29,257,359    8.8G         2.6Gb   2019-10-02
ERR3560570   30,077,294    9G           2.7Gb   2019-10-02
ERR3560572   28,709,627    8.6G         2.5Gb   2019-10-02
ERR3560574   29,824,100    8.9G         2.6Gb   2019-10-02
ERR3560576   32,287,539    9.7G         2.9Gb   2019-10-02
ERR3560578   32,523,247    9.8G         2.9Gb   2019-10-02
ERR3560580   31,165,732    9.3G         2.8Gb   2019-10-02
ERR3560582   32,581,666    9.8G         2.9Gb   2019-10-02
ERR3560584   28,397,011    8.5G         2.5Gb   2019-10-02
ERR3560586   28,913,872    8.7G         2.5Gb   2019-10-02
ERR3560588   28,286,452    8.5G         2.5Gb   2019-10-02
ERR3560590   28,661,828    8.6G         2.5Gb   2019-10-02
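
The totals reported for this experiment can be cross-checked against the per-run figures. The following is a minimal sketch, not part of the record itself: it sums the spot counts listed above and estimates the base count on the assumption that every spot yields two full-length 150 bp reads (taken from "2x150bp reads" in the abstract and the PAIRED layout). The names SPOTS, total_spots and estimated_bases are illustrative only.

```python
# Per-run spot counts copied from the run list of ERX3267468.
SPOTS = {
    "ERR3240093": 360_685_727,
    "ERR3560568": 29_257_359,
    "ERR3560570": 30_077_294,
    "ERR3560572": 28_709_627,
    "ERR3560574": 29_824_100,
    "ERR3560576": 32_287_539,
    "ERR3560578": 32_523_247,
    "ERR3560580": 31_165_732,
    "ERR3560582": 32_581_666,
    "ERR3560584": 28_397_011,
    "ERR3560586": 28_913_872,
    "ERR3560588": 28_286_452,
    "ERR3560590": 28_661_828,
}

READS_PER_SPOT = 2   # assumption: Layout PAIRED means two reads per spot
READ_LENGTH = 150    # assumption: every read is a full 150 bp (2x150bp)


def total_spots(spots):
    """Sum the spot counts over all runs."""
    return sum(spots.values())


def estimated_bases(n_spots):
    """Estimate bases as spots x reads per spot x read length."""
    return n_spots * READS_PER_SPOT * READ_LENGTH


total = total_spots(SPOTS)
print(f"{len(SPOTS)} runs")                          # 13 runs
print(f"{total / 1e6:.1f}M spots")                   # 721.4M spots
print(f"{estimated_bases(total) / 1e9:.1f}G bases")  # 216.4G bases
```

Under these assumptions the computed figures agree with the reported summary of 13 runs, 721.4M spots and 216.4G bases. If reads had been trimmed before submission, the base estimate would overshoot the reported value.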

ID: 7513888
